Identify creative dishes: Sushi Sandwiches (all code version)

Deep Learning assignment with publicly available data

Ángel Martínez-Tenor

September 2018 (Last Updated in May 2021)

Description

Goal: Given pictures of two dishes, identify samples that could be considered a combination of both.

Input: Two separate folders, one per class, each containing the pictures of that class. The example provided here uses a dataset with 402 pictures of sandwiches and 402 pictures of sushi. Link

Only the best model obtained is shown here: a MobileNet with input size (224, 224), pretrained on ImageNet, followed by a small fully connected classifier trained and tuned on this data.

This implementation is largely influenced by, and reuses code from, the following sources:

Setup

Load the Data
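
The one-folder-per-class input layout can be turned into a labeled list of file paths. This is a minimal sketch and not the notebook's own loading code: the function name `list_labeled_images`, the folder names passed to it, and the set of accepted extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the actual dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_labeled_images(data_dir, class_names):
    """Return (path, label) pairs; the label is the index of the class folder."""
    samples = []
    for label, class_name in enumerate(class_names):
        class_dir = Path(data_dir) / class_name
        # Sort for a reproducible order regardless of the file system.
        for path in sorted(class_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((str(path), label))
    return samples
```

With hypothetical folders `data/sandwich` and `data/sushi`, calling `list_labeled_images("data", ["sandwich", "sushi"])` would label sandwiches as 0 and sushi as 1.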

Explore and Process the Data

Visualize the data

Split the data into training and validation sets (not enough data for 3 partitions)
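
A seeded, per-class split keeps both classes balanced in the validation set and makes the partition reproducible. The sketch below is only one way to do this, under stated assumptions: the value of `SEED`, the 20% validation fraction, and the function name `split_train_validation` are illustrative and not taken from the original code.

```python
import random

SEED = 0  # assumed value; changing it yields a different validation set
VALIDATION_FRACTION = 0.2  # assumed split ratio, not stated in the original


def split_train_validation(samples, validation_fraction=VALIDATION_FRACTION, seed=SEED):
    """Shuffle each class separately, then hold out a fraction for validation."""
    rng = random.Random(seed)

    # Group the (path, label) pairs by label so the split is stratified.
    by_label = {}
    for path, label in samples:
        by_label.setdefault(label, []).append((path, label))

    train, validation = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_val = int(round(len(items) * validation_fraction))
        validation.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, validation
```

With 402 pictures per class and these assumed settings, 80 pictures of each class would be held out for validation and the remaining 322 used for training.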

Create image generators with data augmentation

Build and train the Neural Network model

Load a well-known model pretrained on the ImageNet dataset (only convolutional layers)

Get bottleneck features

Build a final fully connected classifier

Train the Classifier with the bottleneck features

Build the full model (pretrained bottleneck + custom classifier)

Make Predictions and get Results

Potential dishes = pictures that are misclassified or whose sigmoid output $\in$ (0.45, 0.55). Only the validation set is used here, to avoid samples seen during training.
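
The selection rule above can be written as a small filter over the validation predictions. This is a framework-independent sketch, not the notebook's code: the function name `find_potential_dishes` and the 0.5 decision threshold are assumptions, and `scores` stands for the sigmoid outputs of the full model.

```python
LOW, HIGH = 0.45, 0.55  # uncertainty band from the rule above
THRESHOLD = 0.5  # assumed decision threshold for the sigmoid output


def find_potential_dishes(paths, labels, scores, low=LOW, high=HIGH):
    """Return paths that are misclassified or whose score lies in (low, high)."""
    potential = []
    for path, label, score in zip(paths, labels, scores):
        predicted = 1 if score >= THRESHOLD else 0
        misclassified = predicted != label
        uncertain = low < score < high  # open interval, as in the rule
        if misclassified or uncertain:
            potential.append(path)
    return potential
```

A confidently and correctly classified picture (for example, label 0 with score 0.10) is skipped, while a correct but uncertain one (label 1 with score 0.52) and any misclassified one are returned.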

Analysis of results and future work

The best model obtained, based on transfer learning with a pretrained MobileNet, achieved accuracies between 89% and 92% on the validation set. Smaller custom convolutional models without transfer learning reached less than 80% accuracy.

The generator of augmented images used to train the classifier relies on the fact that the dishes are usually centered and photographed from different angles.

The identified potential dishes include both pictures that could plausibly be a combination of the two dishes and pictures that show no combination at all. New potential dishes can be obtained by changing the 'SEED' parameter in the main script, which produces a different validation set.

Better classifier accuracy can be obtained by training with a larger dataset or by fine-tuning the top layers of the pretrained MobileNet network. However, the identification of potential dishes is unlikely to improve as a result.

Alternative advanced methods could include Style Transfer or Generative Adversarial Networks for combining data, such as RemixNet.